61 research outputs found

    μ-MAR: Multiplane 3D Marker based Registration for depth-sensing cameras

    Get PDF
    Many applications including object reconstruction, robot guidance, and. scene mapping require the registration of multiple views from a scene to generate a complete geometric and appearance model of it. In real situations, transformations between views are unknown and it is necessary to apply expert inference to estimate them. In the last few years, the emergence of low-cost depth-sensing cameras has strengthened the research on this topic, motivating a plethora of new applications. Although they have enough resolution and accuracy for many applications, some situations may not be solved with general state-of-the-art registration methods due to the signal-to-noise ratio (SNR) and the resolution of the data provided. The problem of working with low SNR data, in general terms, may appear in any 3D system, then it is necessary to propose novel solutions in this aspect. In this paper, we propose a method, μ-MAR, able to both coarse and fine register sets of 3D points provided by low-cost depth-sensing cameras, despite it is not restricted to these sensors, into a common coordinate system. The method is able to overcome the noisy data problem by means of using a model-based solution of multiplane registration. Specifically, it iteratively registers 3D markers composed by multiple planes extracted from points of multiple views of the scene. As the markers and the object of interest are static in the scenario, the transformations obtained for the markers are applied to the object in order to reconstruct it. Experiments have been performed using synthetic and real data. The synthetic data allows a qualitative and quantitative evaluation by means of visual inspection and Hausdorff distance respectively. The real data experiments show the performance of the proposal using data acquired by a Primesense Carmine RGB-D sensor. The method has been compared to several state-of-the-art methods. The results show the good performance of the μ-MAR to register objects with high accuracy in presence of noisy data outperforming the existing methods.This work has been supported by grant University of Alicante projects GRE11-01 and grant Valencian Government GV/2013/005

    A Novel Prediction Method for Early Recognition of Global Human Behaviour in Image Sequences

    Get PDF
    Human behaviour recognition has been, and still remains, a challenging problem that involves different areas of computational intelligence. The automated understanding of people activities from video sequences is an open research topic in which the computer vision and pattern recognition areas have made big efforts. In this paper, the problem is studied from a prediction point of view. We propose a novel method able to early detect behaviour using a small portion of the input, in addition to the capabilities of it to predict behaviour from new inputs. Specifically, we propose a predictive method based on a simple representation of trajectories of a person in the scene which allows a high level understanding of the global human behaviour. The representation of the trajectory is used as a descriptor of the activity of the individual. The descriptors are used as a cue of a classification stage for pattern recognition purposes. Classifiers are trained using the trajectory representation of the complete sequence. However, partial sequences are processed to evaluate the early prediction capabilities having a specific observation time of the scene. The experiments have been carried out using the three different dataset of the CAVIAR database taken into account the behaviour of an individual. Additionally, different classic classifiers have been used for experimentation in order to evaluate the robustness of the proposal. Results confirm the high accuracy of the proposal on the early recognition of people behaviours.This work was supported in part by the University of Alicante, Valencian Government and Spanish government under grants GRE11-01, GV/2013/005 and DPI2013-40534-R

    Three-dimensional planar model estimation using multi-constraint knowledge based on k-means and RANSAC

    Get PDF
    Plane model extraction from three-dimensional point clouds is a necessary step in many different applications such as planar object reconstruction, indoor mapping and indoor localization. Different RANdom SAmple Consensus (RANSAC)-based methods have been proposed for this purpose in recent years. In this study, we propose a novel method-based on RANSAC called Multiplane Model Estimation, which can estimate multiple plane models simultaneously from a noisy point cloud using the knowledge extracted from a scene (or an object) in order to reconstruct it accurately. This method comprises two steps: first, it clusters the data into planar faces that preserve some constraints defined by knowledge related to the object (e.g., the angles between faces); and second, the models of the planes are estimated based on these data using a novel multi-constraint RANSAC. We performed experiments in the clustering and RANSAC stages, which showed that the proposed method performed better than state-of-the-art methods

    A Compilation of Methods and Datasets for Group and Crowd Action Recognition

    Get PDF
    The human behaviour analysis has been a subject of study in various fields of science (e.g. sociology, psychology, computer science). Specifically, the automated understanding of the behaviour of both individuals and groups remains a very challenging problem from the sensor systems to artificial intelligence techniques. Being aware of the extent of the topic, the objective of this paper is to review the state of the art focusing on machine learning techniques and computer vision as sensor system to the artificial intelligence techniques. Moreover, a lack of review comparing the level of abstraction in terms of activities duration is found in the literature. In this paper, a review of the methods and techniques based on machine learning to classify group behaviour in sequence of images is presented. The review takes into account the different levels of understanding and the number of people in the group

    Best Viewpoint Tracking for Camera Mounted on Robotic Arm with Dynamic Obstacles

    Get PDF
    The problem of finding a next best viewpoint for 3D modeling or scene mapping has been explored in computer vision over the last decade. This paper tackles a similar problem, but with different characteristics. It proposes a method for dynamic next best viewpoint recovery of a target point while avoiding possible occlusions. Since the environment can change, the method has to iteratively find the next best view with a global understanding of the free and occupied parts. We model the problem as a set of possible viewpoints which correspond to the centers of the facets of a virtual tessellated hemisphere covering the scene. Taking into account occlusions, distances between current and future viewpoints, quality of the viewpoint and joint constraints (robot arm joint distances or limits), we evaluate the next best viewpoint. The proposal has been evaluated on 8 different scenarios with different occlusions and a short 3D video sequence to validate its dynamic performance.Comment: 10 pages, 6 figures, poster in 3DV conferenc

    Architecture for automatic recognition of group activities using local motions and context

    Get PDF
    Currently, the ability to automatically detect human behavior in image sequences is one of the most important challenges in the area of computer vision. Within this broad field of knowledge, the recognition of activities of people groups in public areas is receiving special attention due to its importance in many aspects including safety and security. This paper proposes a generic computer vision architecture with the ability to learn and recognize different group activities using mainly the local group’s movements. Specifically, a multi-stream deep learning architecture is proposed whose two main streams correspond to a representation based on a descriptor capable of representing the trajectory information of a sequence of images as a collection of local movements that occur in specific regions of the scene. Additional information (e.g. location, time, etc.) to strengthen the classification of activities by including it as additional streams. The proposed architecture is capable of classifying in a robust way different activities of a group as well to deal with the one-class problems. Moreover, the use of a simple descriptor that transforms a sequence of color images into a sequence of two-image streams can reduce the curse of dimensionality using a deep learning approach. The generic deep learning architecture has been evaluated with different datasets outperforming the state-of-the-art approaches providing an efficient architecture for single and multi-class classification problems

    When Deep Learning Meets Data Alignment: A Review on Deep Registration Networks (DRNs)

    Get PDF
    Registration is the process that computes the transformation that aligns sets of data. Commonly, a registration process can be divided into four main steps: target selection, feature extraction, feature matching, and transform computation for the alignment. The accuracy of the result depends on multiple factors, the most significant are the quantity of input data, the presence of noise, outliers and occlusions, the quality of the extracted features, real-time requirements and the type of transformation, especially those ones defined by multiple parameters, like non-rigid deformations. Recent advancements in machine learning could be a turning point in these issues, particularly with the development of deep learning (DL) techniques, which are helping to improve multiple computer vision problems through an abstract understanding of the input data. In this paper, a review of deep learning-based registration methods is presented. We classify the different papers proposing a framework extracted from the traditional registration pipeline to analyse the new learning-based proposal strengths. Deep Registration Networks (DRNs) try to solve the alignment task either replacing part of the traditional pipeline with a network or fully solving the registration problem. The main conclusions extracted are, on the one hand, 1) learning-based registration techniques cannot always be clearly classified in the traditional pipeline. 2) These approaches allow more complex inputs like conceptual models as well as the traditional 3D datasets. 3) In spite of the generality of learning, the current proposals are still ad hoc solutions. Finally, 4) this is a young topic that still requires a large effort to reach general solutions able to cope with the problems that affect traditional approaches.Comment: Submitted to Pattern Recognitio
    • …
    corecore